SYNTHETIC CHYMOSIN- AND PROCHYMOSIN-ENCODING DNA SEGMENTS

The present invention relates generally to molecular
biology. It pertains, more particularly, to synthetic genes and
host transformation therewith.

BACKGROUND OF THE INVENTION 

One goal of molecular biology research is to provide 
alternative, commercially viable sources of proteins and enzymes 
for industrial use. This goal is satisfied in part by 
manipulating cells which can be maintained economically to 
produce proteins otherwise too costly or impractical to 
produce. Recombinant DNA technology provides the tool by which 
the protein-encoding genes are transferred from one cell to 
another to obtain protein production in the desired host. While 
this has been accomplished on many occasions, there remains the 
goal of obtaining more efficient expression of the foreign gene 
in the transformed host. 

In conventional rDNA techniques, a gene from one
species is often used to transform a cell of a different
species, genus, family or even kingdom, without regard for the
compatibility of the gene with its foreign host or, more
importantly, for the ability of the transformed host to express
the foreign gene as efficiently as is possible. For example,
synthesis of the protein encoded by the new gene may be hindered
at the translational level if the required tRNA species are not
present in sufficient quantity to meet the needs of translating
a foreign gene efficiently.

In fact, it has been established that organisms do not
utilize all possible codons with equal frequency. Given that
almost all amino acids are encoded by two or more codons, e.g.
serine is encoded by six different codons, threonine by four
different codons etc., this selective codon usage, i.e. codon
bias, can affect the level at which the foreign gene is
expressed, particularly when the source of the gene and the host
cell are genetically unrelated or, at least, exhibit distinctly
different codon bias.

This codon bias may be present, in a subtle way, in all
genes of a given organism, but is most pronounced in genes which
are expressed to very high levels. The codon selection may be
extremely severe in these cases. For example, one study
(Bennetzen and Hall, 1982) on the highly expressed yeast genes
for alcohol dehydrogenase I (ADH-I) and glyceraldehyde-phosphate
dehydrogenase showed that 96% of the amino acid residues were
encoded by only 25 out of the possible 61 coding triplets. This
"streamlining" of the codon usage makes it possible to express
these genes to very high levels, since the time taken to decode
the mRNA will clearly be the minimum possible, with no delays to
accommodate the effects of relatively rare tRNAs.

The codon bias for highly expressed genes appears to be
a distinct feature in all genomes. However, the codon bias from
organism to organism is not necessarily the same. For example,
although E. coli exhibits a very severe codon bias for highly
expressed genes such as those encoding either the major
lipoprotein or the elongation factors (Gouy and Gautier, 1982),
the actual codons used may be different from those which are
highly preferred in another organism, such as yeast (Bennetzen
and Hall, 1982).

Although the art suggests that codon bias disparity
exists between different organisms, this knowledge has yet to be
applied in a practical way with a view to enhancing gene
expression in transformed hosts.

DESCRIPTION OF THE PRIOR ART 

In the art, attempts at enhancing levels of gene
expression focus almost entirely on the identification,
manipulation and modification of gene elements such as promoter
regions, secretion signals and termination regions or other gene
elements which exert some transcription- or
translation-controlling function over the protein-encoding
region of a gene, which is usually genome-derived or cDNA-based.

A typical example is the approach taken to develop 
rDNA-based biological systems for production of the enzyme known 
as chymosin (also known as rennin). 

Chymosin is an aspartyl protease (EC 3.4.23.4) which is
normally found in the fourth stomach of the unweaned calf, where
it functions in the clotting of milk. It is a secreted protein,
initially synthesized as a longer precursor, preprochymosin.
Preprochymosin has a 16 amino acid signal peptide at its amino
terminus which is cleaved, upon secretion, to yield the zymogen
prochymosin.

The clotting property of chymosin is very important
commercially, since chymosin is the preferred milk coagulant for
the cheese industry. Unfortunately, because of the decreasing
world-wide demand for veal, the supply of calf stomach from
which to recover the enzyme is declining. Accordingly,
recombinant DNA technology has been applied to provide a
sufficient, stable supply of chymosin.

Cloning of the chymosin gene in E. coli was reported by
Beppu et al. in J. Biochem 90: 901-904 (1981). The nucleotide
sequence of calf prochymosin cDNA was reported later by Beppu in
J. Biochem 91: 1035-1088 (1982). The development of chymosin
vectors progressed with the disclosure of an E. coli expression
plasmid having the lacUV5 promoter (1982) or the tryptophan
promoter.

Transformation vectors suitable for expression of
prochymosin by yeast are described in Gene 24 pp 1-14 (1983),
which discloses a vector in which prochymosin cDNA is under the
influence of the yeast PGK gene; in Science, Vol. 229 pp
1219-1223 (September 1985), which describes use of the yeast
invertase secretion signal in a chymosin cDNA vector; and in Gene
27 pp 35-36 (1984), which describes a vector in which chymosin
cDNA is under control of the yeast GAL1 promoter and contains
the yeast SUC2 transcriptional terminator fragment.

In all of the above disclosures, the chymosin or 
prochymosin fragment on the vector is derived from bovine cells 
i.e. is either genomic DNA, or is a cDNA copy of the 
corresponding messenger RNA. No experiments are described which 
attempt to reconcile the codon bias of the new host i.e. yeast 
or E. coli as opposed to bovine stomach cells, with the codon 
usage in the chymosin encoding fragment contained on the 
vector. Similar approaches have been taken in creating hosts
engineered to express other foreign proteins. Although some
gene elements are manipulated and substituted, the protein
coding segment used is a natural, unaltered segment.

SUMMARY OF THE PRESENT INVENTION 

In the present invention, use is made of protein coding 
regions which code for authentic proteins or polypeptides but 
which have been wholly or partially synthesized according to the 
codon bias of the cell which is to be transformed. Once 
synthesized, the coding region may be coupled with other gene 
segments, as desired e.g. a promoter region, a secretion signal 
sequence, a termination region etc. and combined if necessary 
with a suitable marker, replication origin and the like in order
to form a suitable transformation vector. Since, by design, the
synthetic coding region which encodes the protein of interest is
comprised of codons for which the eventual host has a
preference, translation of the synthetic protein coding region
is able to progress at an enhanced rate compared with the rate
at which translation of an authentic but foreign coding region
would occur in the same host.

Thus, according to one aspect of the present invention,
there is provided a protein-encoding DNA segment having a base
sequence optimized for expression in a host to which said
protein is foreign.

In accordance with a further aspect of the present
invention, there are provided vectors containing the
protein-encoding DNA segment, and host cells transformed
therewith.

In accordance with aspects of the present invention
which are preferred herein, there is provided a chymosin- or
prochymosin-encoding DNA segment having a base sequence
optimized for expression in a foreign host. Vectors containing
these synthetic coding regions and hosts transformed therewith
are also within the scope of the present invention.

In the present invention, the protein coding region is
optimized by substituting codons defined by the natural
nucleotide sequence with codons which are preferred by the
intended recipient, i.e. the host cell which is to be transformed.
All or portions of the optimized coding region may be produced
synthetically, the portions then being coupled to the remaining
natural segment, if any, to form an entire optimized coding
region. In this sense, the synthetic coding region and the
authentic coding region are analogous in that they encode the
same or substantially the same amino acid sequence. (By
"substantially the same", it is meant that the amino acid
sequences are functionally equivalent in terms of utility or
activity.) However, some number of codons defined by the
synthetic coding region have been altered by comparison with the
natural coding region so as to reflect the codon bias of the
intended host.

It is believed that supplanting of even one codon for a
more preferred codon will affect the level at which the coding
region is expressed by enhancing that level to some extent. It
may be more desirable to alter the codon usage to a greater
extent, however, with maximum advantage being realized when all
codons of a given coding region are optimized for the host.

In order to synthesize a protein coding segment
optimized for a particular host, knowledge of the codon bias of
the intended host is required. Provided that the nucleotide
sequences of at least a few genes whose protein products are
highly expressed by the natural host are known, this information
can be calculated. From the nucleotide sequence, it is
currently a simple matter to determine the corresponding amino
acid sequence and to tabulate the frequencies at which the
codons are used. When codon usage is compared with amino acid
usage, a pattern of codon bias emerges. With the benefit of
this data, a coding region may be synthesized which utilizes
only codons preferred by the particular host whose codon 
preference has been determined. Indeed, once this pattern is 
determined, the application of the principle is simplified. All 
that is then required is to identify the amino acid sequence of 
the protein encoded by the foreign gene with which the host is 
to be transformed. Once the amino acid sequence of the protein 
encoded by the foreign gene is determined, a gene optimized for 
expression by the new host can be synthesized by incorporating 
only those codons which are the codons preferred by the new host 
for the corresponding amino acids, where enhanced expression is 
desired. 
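
The tabulation described above can be sketched in a few lines of Python. This is only an illustrative sketch and not the procedure actually used by the inventors; the function name codon_bias and the pooling of all input sequences into one count are assumptions made for the example. It takes in-frame coding sequences, counts each codon, and expresses every count as a percentage of the occurrences of the amino acid it encodes.

```python
from collections import Counter, defaultdict

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_bias(coding_sequences):
    """Return {amino_acid: {codon: percent}} pooled over in-frame sequences.

    Hypothetical helper for illustration: sequences are assumed to start
    at the first base of a codon; stop codons are ignored.
    """
    counts = defaultdict(Counter)
    for seq in coding_sequences:
        seq = seq.upper()
        usable = len(seq) - len(seq) % 3  # drop any trailing partial codon
        for i in range(0, usable, 3):
            codon = seq[i:i + 3]
            amino_acid = CODON_TABLE[codon]
            if amino_acid != "*":
                counts[amino_acid][codon] += 1
    return {
        amino_acid: {
            codon: 100.0 * n / sum(codon_counts.values())
            for codon, n in codon_counts.items()
        }
        for amino_acid, codon_counts in counts.items()
    }


if __name__ == "__main__":
    # Toy input: four glycine codons, three GGT and one GGC.
    print(codon_bias(["GGTGGTGGTGGC"]))
```

Applied to the coding sequences of a handful of highly expressed genes from the intended host, a table of this kind gives the percentage occurrence of each synonymous codon.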

Despite the specific tailoring of the coding region
with respect to one particular host, it has been found that the
optimized coding region can be expressed not only in the
particular host species for which it was designed, but also in
other hosts. For example, it has been found that a
protein-coding region optimized specifically for expression in
yeast can also be expressed in filamentous fungi without further
alteration. Accordingly, the optimized protein-encoding
DNA segments of the present invention are useful in obtaining
expression from hosts other than those specific hosts for which
the optimized segment was originally intended.

The hosts with which the present invention is concerned 
primarily are the fungi, including the filamentous fungi and the 
yeasts. Given the general principle underlying the present 
invention, it will be appreciated that other hosts are included 
within its scope. 

Thus, the present invention provides a synthetic
protein-encoding DNA segment which is analogous to a natural
such segment, wherein the synthetic segment is optimized for
expression in a particular host cell foreign to the natural
segment.

As "protein encoding DNA segments" there may be 
mentioned signal sequences, sequences encoding mature proteins, 
entire coding regions, segments thereof including polypeptides 
and the like. 

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

Given their well established role in industrial
application of recombinant DNA technology and efficient
secretive capacity, the yeasts are most preferred as hosts for
the optimized sequences, including Saccharomyces sp. such as
S. carlsbergensis and S. cerevisiae and the species of the genera
Pichia and Candida. Filamentous fungi, e.g. Aspergillus sp., are
also candidates for application of the principle expounded
herein, particularly A. niger and A. nidulans. Of particular
preference is the use of Saccharomyces cerevisiae as host.

The codon bias of S. cerevisiae was determined using
the procedure described generally above. The results are shown
in Table I below:

TABLE I

Codon Bias of S. cerevisiae in Highly Expressed Genes

Amino Acid    Codon    % Occurrence (Yeast)

Gly           GGG        0
Gly           GGA        1
Gly           GGT       97
Gly           GGC        2
Glu           GAG        2
Glu           GAA       98
Asp           GAT       29
Asp           GAC       71
Val           GTG        0
Val           GTA        0
Val           GTT       52
Val           GTC       48
Ala           GCG        0
Ala           GCA        0
Ala           GCT       76
Ala           GCC       24
Arg           AGG        0
Arg           AGA       92
Arg           CGG        0
Arg           CGA        0
Arg           CGT        8
Arg           CGC        0
Lys           AAG       91
Lys           AAA        9
Asn           AAT        3
Asn           AAC       97
Met           ATG      100
Ile           ATA        0
Ile           ATT       43
Ile           ATC       57
Thr           ACG        1
Thr           ACA        0
Thr           ACT       41
Thr           ACC       59
Trp           TGG      100
Cys           TGT       95
Cys           TGC        5
Tyr           TAT        1
Tyr           TAC       99
Phe           TTT        4
Phe           TTC       96
Ser           TCG        1
Ser           TCA        2
Ser           TCT       51
Ser           TCC       45
Ser           AGT        0
Ser           AGC        1
Gln           CAG        0
Gln           CAA      100
His           CAT        3
His           CAC       97
Leu           CTG        0
Leu           CTA        2
Leu           CTT        0
Leu           CTC        0
Leu           TTG       91
Leu           TTA        7
Pro           CCG        0
Pro           CCA       93
Pro           CCT        6
Pro           CCC        1

This codon preference data was derived from the
nucleotide sequences of several very highly expressed yeast
genes: phosphoglycerate kinase (Hitzeman et al., 1982), alcohol
dehydrogenase (Bennetzen and Hall, 1982), enolase A and B
(Holland and Holland, 1981), glyceraldehyde-3-phosphate
dehydrogenase A and B (Holland and Holland, 1981) and pyruvate
kinase (Burke et al., 1983).

Evidence that the codon bias exists in S. cerevisiae is
clear in the Table. For example, although glycine has four
synonymous codons, the codon GGT is preferred dramatically;
while arginine has six synonymous codons, the AGA triplet is
almost always preferred; and of the six synonymous codons for
leucine, TTG is by far the most often utilized. It is believed
that this codon bias correlates directly with the availability
of complementary tRNA species within the tRNA population of the
S. cerevisiae genome. It will be appreciated therefore that
when a foreign gene which does not possess the codons preferred
by this host is inserted into the S. cerevisiae genome, there
will be disparity, at the translational level of protein
synthesis, between the preferred codons and the availability of
appropriate tRNA species. This can impact on the level at which
the product of the foreign gene is expressed by the host.

A striking example of differences in codon preference 
is observed when the codon preference in highly expressed yeast 
genes is compared with the codon preference in the prochymosin 
gene. As described above, chymosin (rennin) is an aspartyl 
protease (EC 3.4.23.4) which is normally found in the fourth 
stomach of the unweaned calf. Thus, the yeast genome and the 
prochymosin source are essentially unrelated genetically. A 
comparison of codon usage in yeast genes and the actual codon 
usage in the prochymosin gene appears in Table II below:

TABLE II

Comparison of Codon Bias:
Bovine Prochymosin vs. Yeast Highly Expressed Genes

                         % Occurrence
Amino Acid    Codon    Bovine Prochymosin    Yeast

Gly           GGG        39                    0
Gly           GGA         3                    1
Gly           GGT        [illegible]          97
Gly           GGC        48                    2
Glu           GAG        86                    2
Glu           GAA        14                   98
Asp           GAT        10                   29
Asp           GAC        90                   71
Val           GTG        55                    0
Val           GTA         7                    0
Val           GTT        10                   52
Val           GTC        28                   48
Ala           GCG         5                    0
Ala           GCA         0                    0
Ala           GCT        16                   76
Ala           GCC        79                   24
Arg           AGG        67                    0
Arg           AGA        11                   92
Arg           CGG         0                    0
Arg           CGA        11                    0
Arg           CGT         0                    8
Arg           CGC        11                    0
Lys           AAG        60                   91
Lys           AAA        40                    9
Asn           AAT        25                    3
Asn           AAC        75                   97
Met           ATG       100                  100
Ile           ATA         5                    0
Ile           ATT         9                   43
Ile           ATC        86                   57
Thr           ACG         8                    1
Thr           ACA        21                    0
Thr           ACT        17                   41
Thr           ACC        54                   59
Trp           TGG       100                  100
Cys           TGT        57                   95
Cys           TGC        43                    5
Tyr           TAT        23                    1
Tyr           TAC        77                   99
Phe           TTT        30                    4
Phe           TTC        70                   96
Ser           TCG        11                    1
Ser           TCA         3                    2
Ser           TCT        11                   51
Ser           TCC        28                   45
Ser           AGT        11                    0
Ser           AGC        36                    1
Gln           CAG        88                    0
Gln           CAA        12                  100
His           CAT        33                    3
His           CAC        67                   97
Leu           CTG        64                    0
Leu           CTA         6                    2
Leu           CTT         6                    0
Leu           CTC        21                    0
Leu           TTG         3                   91
Leu           TTA         0                    7
Pro           CCG        19                    0
Pro           CCA         6                   93
Pro           CCT         6                    6
Pro           CCC        69                    1

The prochymosin data was derived from the sequence of
bovine prochymosin B as published by Harris et al., 1982.

It is clear from Table II that while yeast have a bias
for the AGA codon, the prochymosin gene utilizes the AGG codon
to code for the same amino acid, arginine. The preferred
glutamine codon in the prochymosin gene is CAG whereas in
yeast genes, it is CAA. Other differences in codon preference
are equally apparent from Table II.

It will be appreciated from the above that the coding 
region of the natural prochymosin gene is not well suited for 
expression in yeast, in terms of codon usage. 

To prepare an analogous gene, in which codon usage is
optimized for expression in yeast, nucleotide synthesizers are
preferably employed. Entire genes may be constructed if
desired, although, since only some of the codons need be
substituted to enhance expression to some extent, synthetic
segments in which optimized codons are utilized may be prepared
and used to replace the corresponding native portion of the
selected gene. The present inventors have been able to
synthesize an entire gene consisting of over 2,200 nucleotides,
suggesting that, with modern techniques, abilities to synthesize
long genes entirely should not be limiting to the application of
the principle of the invention.

It should be recognized that, when synthesizing the 
optimized coding region it is possible to delete codons or to 
substitute one or more codons of a coding region by other than 
the most preferred codons to produce a structurally modified 
polypeptide but one which has substantially the same activity or
utility.

The synthetic coding region particularly preferred
herein is a prochymosin-encoding segment which is described in
greater detail in the examples. Since, in constructing the
synthetic prochymosin coding region, a synthetic chymosin coding
region is also created, it will be appreciated that the present
invention also comprises a chymosin-encoding DNA segment having
a base sequence which is optimized for expression in a foreign
host. In constructing this coding region, codons preferred by
S. cerevisiae were incorporated for the most part, although
other, less preferred codons were employed sparingly in order to
incorporate restriction sites within the coding region at
locations which permit convenient ligation of other gene
segments to form a suitable transformation vector containing the
optimized coding region.

Transformation vectors suitable herein will incorporate
a promoter region coupled with the optimized coding region so as
to regulate transcription of the coding region. Since the
preferred hosts are yeast and the filamentous fungi, the
promoter will preferably be any one of the promoters whose
utility in those hosts has been established. Where the intended
host is a yeast, e.g. S. cerevisiae or S. carlsbergensis, the
promoter may be selected from the promoter regions of the
phosphoglucokinase gene, or the GAL1, GAL7, GAL10, invertase or
melibiase genes. The promoter region of the melibiase gene
described in International publication number WO86/03777
published July 3, 1986 is preferred for use in yeast
transformation vectors.

Where the intended host is a filamentous fungus, such
as Aspergillus sp., including the niger and nidulans species
which are preferred filamentous fungus hosts, the promoter
region is preferably derived from any one of the following
Aspergillus genes: glucoamylase, alcohol dehydrogenase I and
aldehyde dehydrogenase. These promoter regions, and vectors
containing them, are described in greater detail in co-pending
U.S. patent application serial number 811,404 filed December 20,
1985, which is incorporated herein by reference.

The transformation vectors suitable herein preferably
comprise a secretion signal sequence in reading frame with and
preferably directly fused with the optimized coding region,
serving to signal secretion of the protein once translated.
Suitable signal sequences for yeast hosts include the secretion
signal of the melibiase gene described in the international
publication cited above. Signal sequences useful in filamentous
fungi include the signal sequence of the glucoamylase gene or a
synthetic consensus signal sequence, both of which are described
in the U.S. patent application cited above.

It will be appreciated, as well, that the natural
signal associated with pre-prochymosin, i.e. the "pre-" region,
may be used as a secretion signal in the vectors of the present
invention. More preferably, however, the "pre-" region base
sequence is optimized in the manner set forth herein with
respect to the prochymosin and chymosin base sequences.

In a particularly preferred embodiment, the selected
optimized coding region encodes prochymosin, although it is
emphasized that the optimized coding region of chymosin, i.e. the
optimized prochymosin region from which the first 42 5' codons
have been removed (see Figure 1, arrow A), may also be used to
obtain expression of mature, active chymosin. The prochymosin
coding region is incorporated, preferably, on any one of a
number of suitable vectors available from the American Type
Culture Collection and used to transform the selected host. For
example, the pGL2 plasmids comprise the promoter region and
signal sequence of the A. niger glucoamylase gene. Accordingly,
the optimized prochymosin coding region can be spliced into the
vector, in reading frame with the signal sequence, and used to
transform a filamentous fungus host, preferably either A.
nidulans or A. niger. Aspergillus sp. transformation can also
be accomplished using the plasmid pALCA1S, which comprises the
promoter of the alcohol dehydrogenase I gene of A. nidulans and
a synthetic consensus signal sequence, once the optimized
prochymosin gene is appropriately incorporated therein.

Yeast transformation is most preferably accomplished
using the plasmid p4, which comprises the promoter region and
signal sequence of the melibiase gene of S. cerevisiae, once the
optimized chymosin region is appropriately incorporated into the
p4 vector.

The specific plasmids described above, i.e. the pGL2
series of plasmids, pALCA1S and p4, are on deposit with the
American Type Culture Collection in Rockville, Maryland, U.S.A.
as follows:

Plasmid      ATCC Accession #    Deposit Date

pGL2A        53365               December 16, 1985
pGL2B        53366               December 16, 1985
pGL2C        53367               December 16, 1985
pALCA1S      53368               December 16, 1985
p4           53360               December 16, 1985

Plasmids containing the optimized prochymosin coding
region, the construction of which is described in detail
hereinafter, were deposited with ATCC in January, 1987 in
E. coli host DH1 and have been allotted the following accession
numbers:

pMV-1/CHYM105        ATCC 67294
pALCA1S/CHYM103      ATCC 67295
pGL2C/CHYM101        ATCC 67296

Yeast hosts have been successfully transformed with
plasmid pMV-1/CHYM105 and filamentous fungus hosts have been
successfully transformed with plasmids pGL2C/CHYM101 and
pALCA1S/CHYM103.

An embodiment of the invention is described hereinafter
by way of example only with reference to the accompanying
drawings in which:

Figure 1 represents, on three sheets, the nucleotide sequences
         of both the authentic and the optimized prochymosin A
         coding region (lower nucleotide sequence) as well as
         the amino acid sequence corresponding thereto, using
         conventional abbreviations for nucleotides and amino
         acids;

Figure 2 illustrates, on 5 sheets, the plurality of synthetic
         oligonucleotides which were coupled to form the
         optimized prochymosin coding region;

Figure 3 illustrates schematically the creation of plasmid
         pCHYM1A(Y)B2 and pCHYM1AB2-3 which comprise portions of
         the optimized prochymosin coding region;

Figure 4 illustrates the creation of plasmid pCHYM345-21 which
         comprises another portion of the optimized prochymosin
         coding region;

Figure 5 illustrates, in general terms, the creation of
         transformation vectors which incorporate the entire
         optimized prochymosin coding region;

Figure 6 is a plasmid map of pMV-1/CHYM105;

Figure 7 is a plasmid map of pGL2C/CHYM101; and

Figure 8 is a plasmid map of pALCA1S/CHYM103.

In the example which follows, the bovine prochymosin
gene and, consequently, the chymosin segment thereof, have been
totally chemically synthesized in order to incorporate a codon
bias which matches that of the yeast, Saccharomyces cerevisiae.
This will unblock a potentially rate-limiting translation step
and permit a level of expression that might not otherwise be
attainable using a cDNA-based copy of the gene.

The difference in codon usage between the bovine
prochymosin gene and a group of highly expressed yeast genes is
indeed quite dramatic. Referring to Table II, supra, it is
readily seen that many of the codons in the natural prochymosin
gene are among the worst choices for efficient expression in
yeast, especially those for Gly, Glu, Ala, Arg, Gln, Leu and
Pro, and to a marked but lesser extent for such amino acids as
Val, Lys, Ile, Cys and Ser.

By chemically synthesizing the entire coding region of
the gene for bovine prochymosin, all of those codons which would
otherwise result in less efficient prochymosin production in
yeast are replaced. This represents a very significant
improvement to the conventional strategy of expressing a cDNA or
genomic clone version of the particular gene, especially in a
micro-organism which is capable of high level expression when
not otherwise constrained by poor codon selection.

The codon-optimized synthetic bovine prochymosin gene
was originally derived from the published amino acid sequences
of prochymosin B (Pederson et al., 1975; Foltmann et al.,
1979). This full peptide sequence was "reverse translated", by
a standard computer program (Devereux et al., 1984), back into
very limited codon assignments. For each amino acid, only the
most frequently used yeast codon from yeast highly expressed
genes (i.e. the codon with the highest usage frequency in
Table I, supra) was utilized. This produced a full sequence
encoding bovine prochymosin B which was totally optimized for
yeast high level expression.
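
The reverse translation step can be illustrated with a short Python sketch. It is not the computer program cited above; the names PREFERRED_YEAST_CODON and reverse_translate are hypothetical. The dictionary simply records, for each amino acid, the codon with the highest percentage occurrence in Table I.

```python
# Most frequently used codon per amino acid, read from Table I
# (one-letter amino acid codes; names are illustrative assumptions).
PREFERRED_YEAST_CODON = {
    "G": "GGT", "E": "GAA", "D": "GAC", "V": "GTT", "A": "GCT",
    "R": "AGA", "K": "AAG", "N": "AAC", "M": "ATG", "I": "ATC",
    "T": "ACC", "W": "TGG", "C": "TGT", "Y": "TAC", "F": "TTC",
    "S": "TCT", "Q": "CAA", "H": "CAC", "L": "TTG", "P": "CCA",
}


def reverse_translate(peptide):
    """Map each residue of a one-letter peptide to its preferred codon."""
    return "".join(PREFERRED_YEAST_CODON[residue] for residue in peptide.upper())


if __name__ == "__main__":
    # Toy peptide Met-Arg-Gly-Leu, not taken from the prochymosin sequence.
    print(reverse_translate("MRGL"))
```

Because each amino acid maps to exactly one codon, the output is fully determined by the peptide sequence, which is what makes the resulting gene "totally optimized" in the sense used here.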

This gene sequence is but one example of the invention
herein claimed. For the particular embodiment described herein,
however, a few further (minor) modifications were made.

Some of the nucleotide positions were further changed
to allow the creation of desirable restriction enzyme
recognition sites, which were deemed useful for ease of cloning,
sequencing and other subsequent manipulations. In addition, a
few positions were changed for the reverse reason, namely to
remove restriction enzyme sites that might prove troublesome for
these manipulations. It must be stressed, however, that these
few additional modifications were made without in any way
compromising the underlying principle of using high bias
codons. In those cases where restriction sites were added or
deleted, the substituted codon would therefore no longer be the
one with highest frequency, but would be replaced by one with
slightly lower preference. In no case was a codon which is
decidedly unfavourable ever used.
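
A minimal Python sketch of the site-removal idea follows, under stated assumptions: the ACCEPTABLE table is a small subset of Table I listing only codons with substantial usage, the helper names find_sites and remove_site are hypothetical, and only a single codon swap is attempted. The example uses the KpnI recognition sequence GGTACC, which arises whenever the preferred codons for Gly (GGT) and Thr (ACC) are adjacent.

```python
# Subset of Table I: codons with substantial usage, most preferred first.
# Decidedly unfavourable codons are deliberately left out (an assumption
# about where to draw the line, made only for this illustration).
ACCEPTABLE = {
    "G": ["GGT"],
    "T": ["ACC", "ACT"],
    "V": ["GTT", "GTC"],
    "D": ["GAC", "GAT"],
}


def find_sites(seq, site):
    """Return the start offset of every occurrence of site in seq."""
    return [i for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)]


def remove_site(seq, peptide, site):
    """Remove site with one synonymous codon swap, if possible.

    Returns the edited sequence, the input unchanged when the site is
    absent, or None when no single acceptable swap removes it.
    """
    for start in find_sites(seq, site):
        first = start // 3
        last = (start + len(site) - 1) // 3
        for idx in range(first, last + 1):
            # Skip the first entry: it is the codon already in use.
            for alternative in ACCEPTABLE.get(peptide[idx], [])[1:]:
                candidate = seq[:3 * idx] + alternative + seq[3 * idx + 3:]
                if not find_sites(candidate, site):
                    return candidate
    return seq if not find_sites(seq, site) else None


if __name__ == "__main__":
    # Gly-Thr with preferred codons reads GGTACC; swapping Thr to ACT removes it.
    print(remove_site("GGTACC", "GT", "GGTACC"))
```

Adding a site works the same way in reverse: for instance, a Val-Asp pair written as GTT GAC can be changed to GTC GAC, creating the SalI sequence GTCGAC at the cost of using the second most frequent valine codon.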

Additionally, in the particular example described
herein, the segment encoding prochymosin B was further modified
to become a segment which encodes prochymosin A (Moir et al.,
1982). These isozymes differ by only three amino acid
substitutions, i.e. of the 366 amino acids which define chymosin,
Val 139, Ser 216 and Gly 286 in prochymosin B correspond to
Leu 139, Cys 216 and Asp 286, respectively, in prochymosin
A. It is to be noted, however, that since there is nothing
fundamentally different between prochymosin A and B, the coding
region which encodes the B isozyme could just as well have been
used.

The coding region of bovine prochymosin A is shown in
Figure 1. Each row represented in Figure 1 identifies the amino
acid sequence (top line), the nucleotide sequence of the natural
coding region (middle line) and the nucleotide sequence of the
optimized coding region (bottom line). In addition, restriction
sites are identified either by " - " to indicate a site which
has been deleted in the optimized segment or by " + " to
indicate where a particular site has been added. It will be
readily seen from Figure 1 that the optimized gene is very
significantly different in sequence from the natural
counterpart. There are in fact 264 substitutions out of a total
of 1095 coding nucleotides, or greater than 24% difference. A
significant number of these differences are in the third
positions of many codons, as expected. However, many wholesale
codon changes have also been made, in the cases of Serine,
Arginine and Leucine, which are each encoded by two different
families of codons which differ in more than the third position.
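
The comparison quoted above can be reproduced for any pair of aligned coding sequences with a short Python sketch. The function name substitution_summary is an assumption made for illustration, and the toy sequences are not taken from Figure 1. The function counts differing nucleotides in total and by position within the codon, so third-position changes can be distinguished from the wholesale codon changes seen for Ser, Arg and Leu.

```python
def substitution_summary(natural, optimized):
    """Count nucleotide differences between two aligned, in-frame sequences.

    Returns the total number of substitutions, the percentage of positions
    changed, and the counts at codon positions 1, 2 and 3.
    """
    if len(natural) != len(optimized):
        raise ValueError("sequences must have the same length")
    by_codon_position = [0, 0, 0]
    for index, (a, b) in enumerate(zip(natural.upper(), optimized.upper())):
        if a != b:
            by_codon_position[index % 3] += 1
    total = sum(by_codon_position)
    return {
        "total": total,
        "percent": 100.0 * total / len(natural),
        "by_codon_position": by_codon_position,
    }


if __name__ == "__main__":
    # Toy Leu-Arg example: CTG AGG (bovine-style) versus TTG AGA (yeast-style).
    print(substitution_summary("CTGAGG", "TTGAGA"))
    # The figure quoted in the text: 264 substitutions in 1095 nucleotides.
    print(round(100.0 * 264 / 1095, 1))
```

In the toy example the leucine codon changes at its first position and the arginine codon at its third, illustrating the two kinds of substitution mentioned in the paragraph above.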

Figure 1 also indicates the coding region of bovine
chymosin A, since this region is contained within the prochymosin
region illustrated. The synthetic chymosin coding region is
represented by the base sequence beginning at arrow A on Figure
1 and terminating where indicated for prochymosin. This region,
and vectors containing it, represent an additional, preferred
embodiment of the present invention.

The optimized gene for prochymosin A, as shown in 
Figure 1 was therefore synthesized using several modifications 
to the accepted automated procedures for synthesizing 
oligodeoxynucleotides, resulting in the capability to make 
longer oligonucleotide chains. The overall design strategy for 
the assembly of the synthetic gene made use of fewer 
oligonucleotides of longer average length than is currently the 
norm. This enabled construction of the entire gene in fewer 
overall steps. 

The entire gene was comprised of 28 separately 
synthesized oligonucleotides (14 pairs, whose sequences were 
precisely complementary, except for terminal overhangs). These 
oligonucleotides ranged in length from 59 to 102 nucleotides. 
The full sequences and designations of these oligonucleotides 
are shown in Figure 2, referenced in more detail hereinafter. 
The oligonucleotides were separated into subassemblies 1 through 
5. The oligonucleotides comprising each individual subassembly 
were assembled separately into an appropriate (commercially 
available) M13mp vector. In some cases (subassemblies 1, 2 and 
3), small extensions were added to the complementary part of 
oligonucleotides comprising one end of the final subassembly, 
strictly to facilitate the intermediate cloning and sequencing 
steps. These small extensions were left behind when the actual 
prochymosin gene portion was subsequently removed. 
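
The pairing rule described above (each pair precisely complementary except for terminal overhangs) can be checked with a sketch like the following. It assumes, for simplicity, that both strands are written 5' to 3' and that each carries its unpaired overhang at its own 5' end; the function names and the short sequences are hypothetical and are not oligonucleotides from Figure 2.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def duplex_region_matches(top: str, bottom: str,
                          top_overhang: int, bottom_overhang: int) -> bool:
    """Check that two strands pair exactly outside their 5' overhangs.

    Both strands are given 5'->3'. The first `top_overhang` bases of the
    top strand and the first `bottom_overhang` bases of the bottom strand
    are assumed to be single-stranded.
    """
    top_paired = top[top_overhang:].upper()
    bottom_paired = bottom[bottom_overhang:]
    return top_paired == reverse_complement(bottom_paired)


# Hypothetical pair with four-base 5' overhangs at each end.
top = "CTAGATGGCC"
bottom_good = "AATTGGCCAT"
bottom_bad = "AATTGGCCAA"  # one mismatch in the paired region

good = duplex_region_matches(top, bottom_good, 4, 4)
bad = duplex_region_matches(top, bottom_bad, 4, 4)
```

A pair generating a 3' overhang would need the slicing reversed; that case is left out of this sketch.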

The prochymosin encoding portions of the various 
subassemblies were then excised out of their intermediate 
plasmid hosts, and assembled into the final gene. The 
prochymosin gene was placed under the control of one of three 
different promoters and secretion signals: the melibiase 
promoter and signal for expression and secretion in the yeast 
Saccharomyces cerevisiae, i.e. plasmid p4 ATCC 53360; the 
Aspergillus niger glucoamylase promoter and secretion signal, 
i.e. plasmid pGL2C ATCC 53367; or the Aspergillus nidulans 
alcohol dehydrogenase promoter and a synthetic signal for 
expression and secretion by Aspergillus, i.e. plasmid pAlcAlS 
ATCC 53368. The overall strategy for the assembly of the gene 
is shown in Figures 3, 4, and 5. 

Synthesis of Oligonucleotides 
a) Materials 

Oligonucleotides were prepared on a Biosearch SAM ONE 
Series II or Applied Biosystems 380B DNA synthesizer. 
Methyl-N,N-diisopropylaminophosphoramidites were obtained from 
Applied Biosystems (ABI) or Beckman Instruments. 
β-Cyanoethyl-N,N-diisopropylaminophosphoramidites were from ABI, 
American Bionetics (ABN), or Biosearch Inc. 
Nucleoside-derivatized controlled pore glass supports (CPG) were 
purchased from ABI or ABN. "Low-loaded CPG" was purchased from 
ChemGenes Corp. Bis-(β-cyanoethyl)-N,N-diisopropylaminophosphoramidite 
was obtained from ChemGenes. All other synthesis grade solvents 
and reagents were purchased from ABI or were prepared in-house 
according to ABI protocols. Tetrazole was obtained from either 
Cruachem or Aldrich Chemicals. Acetic anhydride (ACS reagent 
grade), dichloroacetic acid (DCA, 99+%), 4-dimethylaminopyridine 
(DMAP, 99%), 2,6-lutidine (97%) and iodine (Gold Label) were 
purchased from Aldrich. Tetrahydrofuran (THF) and acetonitrile 
(MeCN, both HPLC grade) were purchased from Caledon 
Laboratories. The acetonitrile was rigorously dried by 
refluxing over calcium hydride and was distilled fresh just 
prior to use. 

b) Synthesis 

All of the prochymosin gene oligonucleotide components 
were synthesized using phosphoramidite chemistry (Beaucage and 
Carruthers, 1981; McBride and Carruthers, 1983) on either the 
Biosearch or ABI instrument according to modified Biosearch or 
ABI protocols, respectively. These modifications are described 
below. 

Syntheses carried out on the Biosearch SAM ONE were run 
on a 0.3 µmole scale of starting nucleoside instead of the 
standard 1 µmole scale. The standard 20 minute AMIDITE program 
was used. This program utilized a 2.5 minute coupling with 
25-30 µmoles of phosphoramidite (83- to 100-fold molar excess). 
Dichloroacetic acid (DCA, 2.5% in dichloromethane) was used 
instead of the stronger trichloroacetic acid (TCA) for removing 
the dimethoxytrityl protecting groups (detritylation step). The 
CPG-linked oligonucleotides were cleaved manually, directly in 
their columns by ammonium hydroxide, at room temperature, then 
completely deprotected by heating at 55°C for 12-16 hours. 

Syntheses on the ABI 380B were carried out using a new 
ABI small-scale synthesis program (ssb003). The synthetic scale 
was further reduced from 0.2 µmoles to 0.1-0.15 µmoles of 
starting nucleoside, thereby increasing the effective molar 
excess of phosphoramidite from 25-fold to 33- to 50-fold (for 5 
µmoles amidite per coupling). For syntheses of very long 
oligonucleotides (>60mers) the ChemGenes "low-loaded CPG" (5-10 
µmoles nucleoside/gram CPG) was used instead of the usual 25-35 
µmoles/gram CPG. The DCA treatment was increased from 5 x 10 
sec. to 7 x 10 sec. to ensure complete reaction during long 
oligonucleotide syntheses. Coupling times were increased from 
35 sec. to 60-90 sec. to increase coupling efficiency. 
Oligonucleotides were cleaved automatically from the CPG on the 
synthesizer, then deprotected as above. 
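
The molar excesses quoted for the two instruments follow directly from the amidite amount per coupling divided by the synthesis scale. The short sketch below reproduces that arithmetic; the function and variable names are illustrative only.

```python
def molar_excess(amidite_umol: float, scale_umol: float) -> float:
    """Fold molar excess of phosphoramidite over starting nucleoside."""
    return amidite_umol / scale_umol


# Biosearch SAM ONE: 25-30 umol amidite on a 0.3 umol scale.
biosearch_low = molar_excess(25, 0.3)
biosearch_high = molar_excess(30, 0.3)

# ABI 380B: 5 umol amidite per coupling.
abi_standard = molar_excess(5, 0.2)       # standard small-scale program
abi_reduced_low = molar_excess(5, 0.15)   # reduced scale, upper end
abi_reduced_high = molar_excess(5, 0.10)  # reduced scale, lower end

for label, value in [
    ("Biosearch, 25 umol", biosearch_low),
    ("Biosearch, 30 umol", biosearch_high),
    ("ABI, 0.2 umol scale", abi_standard),
    ("ABI, 0.15 umol scale", abi_reduced_low),
    ("ABI, 0.10 umol scale", abi_reduced_high),
]:
    print(f"{label}: {value:.0f}-fold")
```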

Two minor modifications were made to the 380B 
synthesizer to expand its synthetic capacity. A fourth 
acetonitrile reservoir was added to maximize the wash capacity. 
The ssb003 synthesis program was altered to access each of these 
acetonitrile reservoirs equally during a cycle. In order to 
increase the solvent/reagent waste collection capacity, the 
standard 2 litre waste bottle was modified to drain continuously 
into a 20 litre reservoir. 

c) Purification 

The oligonucleotides were purified by standard 
polyacrylamide gel electrophoresis methods (ABI User Bulletin 
No. 13, 1984), using 8-10% gels (1.5mm x 40cm x 16cm). Half of 
the crude oligomer was loaded into 2-4 10mm wells and 
electrophoresed for 6-8h at 400-500 volts. Product bands were 
visualized by UV-shadowing, excised from the gel, and the 
oligonucleotide electroeluted from the gel slices using an 
electroelution apparatus (International Biotechnologies Inc.). 
The oligonucleotides were eluted for 15-30 minutes at 120 volts 
into a 10M ammonium acetate solution, then desalted on C18 
Sep-Pak cartridges (Waters Associates). Yields were determined 
by the absorbance at 260nm (OD260 units) on a Beckman DU8B 
spectrophotometer. 

Enzymatic 5'-Phosphorylation of Oligonucleotides 


a) Analytical Scale Phosphorylation with [32P]-ATP 

Small scale phosphorylations were used to check the 
purity of the oligonucleotides, and to establish estimates of 
the amount of OD260 material that was actually capable of being 
phosphorylated. This latter determination was used to adjust 
the molar concentrations of oligonucleotides in the subsequent 
ligations. 

Reactions were performed in buffer containing 50 mM 
Tris-HCl (pH 7.5), 10 mM MgCl2, 10 mM DTT, and 1.0 mM spermidine. 
Mixtures contained 1 pmol/µl of oligonucleotide, 0.2 pmol/µl 
[gamma-32P]ATP (3 µCi/pmol), 25 pmol/µl of ATP, and 5-10 units 
of T4 polynucleotide kinase (P.L. Biochemicals). Incubation was 
for 40 min. at 37°C, followed by 10 min. at 65°C. 

The labelled oligonucleotides were checked for purity 
by electrophoresis, under denaturing conditions, through a gel 
of 10% polyacrylamide containing 7M urea. The effective 
oligonucleotide concentration was determined by relating the 
actual number of Cerenkov counts recovered per band, to the 
theoretical amount expected, based upon the amount of OD260 
material, and assuming 100% efficiency for 5'-end labelling. 
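
One way to read the determination described above is as a simple ratio: the fraction of OD260 material that can be phosphorylated is the recovered counts divided by the counts expected at 100% labelling, and the nominal concentration is scaled by that fraction. The sketch below assumes that reading; the function name and the example numbers are hypothetical and are not measurements from this work.

```python
def effective_concentration(nominal_pmol_per_ul: float,
                            observed_counts: float,
                            expected_counts: float) -> float:
    """Scale an OD260-based concentration by the labelled fraction.

    expected_counts is the theoretical count for the band if every
    molecule estimated from OD260 had been 5'-end labelled.
    """
    if expected_counts <= 0:
        raise ValueError("expected_counts must be positive")
    fraction = observed_counts / expected_counts
    return nominal_pmol_per_ul * fraction


# Hypothetical example: 6000 counts recovered where 10000 were expected.
effective = effective_concentration(1.0, 6000, 10000)
```

The resulting value would then be used in place of the nominal concentration when setting up ligations.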

b) Large Scale Phosphorylation with [32P]-ATP 

For subassembly 2, a larger scale phosphorylation was 
performed. This was identical to that described above, except 
that the oligonucleotide concentration was quadrupled to 4 
pmol/µl, and the ATP concentration was increased to 100 pmol/µl. 

Vector Preparation 


Each of the various subassemblies described below was 
cloned into one of M13mp10, M13mp11, M13mp19, or pBR322 
(all of which are commercially available). These vectors were 
digested with the appropriate restriction enzymes, as described 
below, all according to manufacturers' specifications. 
Dephosphorylation of 5'-termini, when utilized, was accomplished 
using commercial preparations of either calf intestinal alkaline 
phosphatase or bacterial alkaline phosphatase, according to 
manufacturers' specifications. Vector fragments to be ligated 
to oligonucleotide subassemblies were purified by 
electrophoresis through gels of low-melting agarose (Bio-Rad), 
followed by isolation of the DNA by phenol extraction of the 
melted gel slice, according to widely accepted procedures (e.g. 
Maniatis et al., 1982). 


Ligation of Oligonucleotide Subassemblies 

a) "Shotgun Ligation" 

In a typical experiment, phosphorylated 
oligonucleotides comprising one entire subassembly were mixed 
together in annealing buffer (50 mM Tris-HCl (pH 7.5), 10 mM 
MgCl2), at a final concentration of approximately 100 nM each 
(i.e. 1-5 pmol of each oligonucleotide in a total volume of 
10-50 µl). The mixture was heated to 95°C for 5 min., followed 
by slow cooling to room temperature. The appropriate vector was 
then added, in proportions ranging from about 1/10 molar to 
approximately equimolar, with respect to the oligonucleotides 
added originally (i.e. for 2 pmol each of oligonucleotides, from 
0.2 pmol to 2.0 pmol of vector was added). To this mixture was 
then added 1/10 vol of 10x ligation buffer (10x = 500 mM Tris-Cl, 
pH 7.5, 100 mM MgCl2, 200 mM DTT, 10 mM ATP, 1 mM spermidine and 
500 µg/ml BSA) and 2.5 units of T4 ligase (P.L. Biochemicals). 
Ligation was carried out at 15°C for 12-18 hours. 

b) "Block Ligation" 


In the case of subassembly 2, the six internal 
oligonucleotides (i.e. all except the two that would contribute 
the ultimate 5'-ends) were individually phosphorylated with 
[32P]-ATP in a larger scale reaction, as described above. The 
six phosphorylated, labelled oligonucleotides were mixed, in 
equal proportions, with the remaining two, non-phosphorylated 
oligonucleotides, and were annealed. Ligation was carried out 
in the usual fashion, except that (1) the reaction mixtures 
contained at least a 5-fold higher concentration of 
oligonucleotides (i.e. 20 pmoles each in a final volume of 40 
µl), and (2) no vector was added. This modified ligation 
mixture was electrophoresed through an 8% polyacrylamide gel. 
The correctly ligated "block" was identified on an autoradiogram 
by virtue of its size, compared to standard molecular weight 
markers. The gel band was excised, the DNA within it was 
electroeluted, and the material was concentrated and purified. 
The purified "block" was then mixed with an equimolar amount of 
the appropriate vector, and ligation was carried out in ligation 
buffer as described above. 
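
The concentrations quoted for the shotgun and block ligations can be cross-checked with the arithmetic below, using the identity 1 pmol/µl = 1 µM = 1000 nM. The helper names are illustrative only.

```python
def nanomolar(pmol: float, volume_ul: float) -> float:
    """Convert an amount in pmol and a volume in microlitres to nM."""
    return pmol / volume_ul * 1000.0


def vector_range_pmol(oligo_pmol: float) -> tuple:
    """Vector amount from about 1/10 molar to equimolar with the oligos."""
    return (0.1 * oligo_pmol, 1.0 * oligo_pmol)


# Shotgun ligation: 1-5 pmol of each oligonucleotide in 10-50 ul.
shotgun_low = nanomolar(1, 10)
shotgun_high = nanomolar(5, 50)

# Block ligation: 20 pmol of each oligonucleotide in 40 ul.
block = nanomolar(20, 40)
fold_increase = block / shotgun_low

# Vector added for 2 pmol of each oligonucleotide.
vector_low, vector_high = vector_range_pmol(2.0)
```

Both ends of the shotgun range work out to 100 nM, and the block ligation to 500 nM, which matches the stated 5-fold increase.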

Transformation 

Frozen, competent cells of E. coli strain JM109 were 
prepared, and transfected by standard procedures (e.g. Maniatis 
et al., 1982). 

Analysis of Transformants 

Putative transformants were analyzed by performing a 
variation of the "mini-screen" described by Holmes and Quigley 
(1981). Those candidates which looked correct were further 
analyzed by full sequence analysis by the dideoxy chain 
termination method of Sanger et al. (1977). 

Assembly of the 5' Half of the Prochymosin Gene 

Subassembly pCHYM 1Bβ, illustrated in Figure 3, was 
formed from oligonucleotides 1B and 1β (the sequences of these 
and all other component oligonucleotides are shown in Figure 
2). The annealed oligonucleotides formed a Pst I site on one 
end, and an Eco RI site on the other, and were cloned into 
M13mp19 that had been digested with both these enzymes. The 
prochymosin gene portion is located between the Pst I site and 
an Xba I site which lies just ahead of the Eco RI cloning site, 
so that the fragment could eventually be isolated by digestion 
with Pst I and Xba I (Figure 3). 

Subassembly pCHYM 2-4 (Figure 3) was formed from 
oligonucleotides 2A, 2B, 2C, 2D, 2α, 2β, 2γ, and 2δ. 
These were annealed, ligated and purified from an acrylamide gel 
in a "block" as described above. The completed block is bounded 
by the restriction sites for Xba I and Eco RI, and was cloned 
into M13mp19 digested with these enzymes. The relevant portion 
of the prochymosin gene lies between the Xba I site and an Xma I 
site just ahead of the Eco RI cloning site. 

The actual 5'-end of the prochymosin gene was provided 
in two alternate sets of oligonucleotides, 1A and 1α as well as 
1A(Y) and 1α(Y), each pair of which has one end compatible 
with Bgl II and one end compatible with Pst I (Figure 3). The 
only difference is that the oligonucleotides with the "Y" 
designation have two additional nucleotides which have been 
added to ensure joining in the correct reading frame when using 
the yeast expression and secretion vector, pMV-1, as well as 
with the Aspergillus expression and secretion vector pGL2C. The 
alternative 5' end, comprised of oligos 1A and 1α, will ensure 
the correct reading frame when using the Aspergillus expression 
and secretion vector pAlcAlS. 

To form the larger assembly pCHYM 1A(Y)B2 or pCHYM 
1AB2-3, the annealed oligonucleotide pair representing the 5' 
end (either 1A(Y) + 1α(Y) or 1A + 1α), the Xba I/Pst I fragment 
from pCHYM 1Bβ, the Xba I/Eco RI fragment from pCHYM 2-4, and 
M13mp10 which had been digested with Bam HI and Eco RI, were 
mixed in a molar ratio of 20:1:1:1 respectively, and were 
ligated in a single reaction. 

The final, relevant prochymosin gene portion could 
therefore now be excised from this vector by digestion with 
Sau3A and Xma I (Figures 3 and 5). 

Assembly of the 3' Half of the Prochymosin Gene 

Subassembly pCHYM 3-3 (Figure 4) was formed from the 
component oligonucleotides 3A, 3B, 3C, 3α, 3β, and 3γ. The 
boundaries are delineated by an Xma I site and an Eco RI site. 
Cloning was done into M13mp19 which had been similarly 
digested. The relevant prochymosin gene portion lies between 
the Xma I site and a Sal I site which lies just ahead of the Eco 
RI cloning site (Figure 4). 

Subassembly pCHYM 4-19 (Figure 4) contains the 
oligonucleotides 4A, 4B, 4α, and 4β, cloned between the Eco RI 
and Hind III sites of M13mp19. All of the inserted material is 
authentic prochymosin coding sequence. 

The final subassembly, pCHYM 5-9, was formed from the 
remaining component oligonucleotides. These oligonucleotides 
form a prochymosin gene fragment bounded by Eco RI and Bam HI 
sites. They were assembled into M13mp19 which had been digested 
with these two enzymes (Figure 4). 

The component fragments from these 3 subassemblies were 
ligated into the larger structure designated pCHYM 345-21. The 
Bam HI/Sal I fragment from pCHYM 3-3, the Sal I/Eco RI fragment 
from pCHYM 4-19 and the Eco RI/Hind III fragment from pCHYM 5-9 
were all identified and isolated by electrophoresis through gels 
of low melting agarose. They were then ligated, in equimolar 
concentrations, in a single reaction, to pBR322 which had been 
digested with Bam HI and Hind III. 

In this way, the relevant prochymosin gene portion 
could be excised with a combination of Xma I and Bam HI 
(Figures 4 and 5). 

Expression Vectors from Yeast and Filamentous Fungi 

The recipient secretion and expression vectors used in 
this work have already been fully described, and are shown in 
schematic form in Figure 5. The pMV-1 plasmid expression vector 
for Saccharomyces cerevisiae is based upon the melibiase 
promoter and secretion signal contained on plasmid p4 
ATCC 53360. The filamentous fungal expression vectors are based 
upon either the Aspergillus niger glucoamylase promoter and 
secretion signal (plasmid vector pGL2C ATCC 53367), or the 
Aspergillus nidulans alcohol dehydrogenase promoter and a 
synthetic signal (plasmid vector pAlcAlS ATCC 53368). 

Each of these plasmid vectors has either a Bgl II or 
Bam HI cloning site at the end of the DNA sequence which encodes 
their respective secretion signals. The 5'-termini of the 
prochymosin gene segments were designed to be compatible with 
either of these sites (Figure 5). 

Assembly of Prochymosin Expression Vector pMV-1/CHYM 105 

Plasmid pCHYM 1A(Y)B2 was digested with Sau3A and Xma 
I, while plasmid pCHYM 345-21 was digested with Xma I and Bam 
HI. In both cases the prochymosin-containing fragments were 
isolated and purified by electrophoresis through a gel of 
low-melting temperature agarose (2%). The yeast 
secretion/expression vector pMV-1 was digested with Bgl II, and 
the 5'-phosphates were removed by treatment with bacterial 
alkaline phosphatase (IBI). The linear, dephosphorylated form 
was purified by electrophoresis through a gel of low-melting 
temperature agarose (1%). Aliquots of the three melted gel 
fragments were mixed to give approximately equimolar proportion 
of fragments, and ligation was carried out at 15°C for 18 hrs. 
The resultant prochymosin expression/secretion vector, 
pMV-1/CHYM 105 ATCC 67294 (Figure 6), was identified by 
restriction analysis, and confirmed by full dideoxy nucleotide 
sequence analysis. 

Assembly of Prochymosin Expression Vector pGL2C/CHYM 101 

Plasmid pCHYM 1A(Y)B2 was digested with Sau3A and Xma 
I, while plasmid pCHYM 345-21 was digested with Xma I and Bam 
HI. In both cases the prochymosin-containing fragments were 
isolated and purified by electrophoresis through a gel of 
low-melting temperature agarose (2%). The Aspergillus niger 
secretion/expression vector pGL2C ATCC 53367 was digested with 
Bgl II, and the 5'-phosphates were removed by treatment with 
bacterial alkaline phosphatase (IBI). The linear, 
dephosphorylated form was purified by electrophoresis through a 
gel of low-melting temperature agarose (1%). Aliquots of the 
three melted gel fragments were mixed to give approximately 
equimolar proportion of fragments, and ligation was carried out 
at 15°C for 18 hrs. The resultant prochymosin 
expression/secretion vector, pGL2C/CHYM 101 ATCC 67296 (Figure 
7), was identified and confirmed by restriction analysis. 

Assembly of Prochymosin Expression Vector pAlcAlsL/CHYM 103 

Plasmid pCHYM 1A(Y)B2 was digested with Sau3A and Xma 
I, while plasmid pCHYM 345-21 was digested with Xma I and Bam 
HI. In both cases the prochymosin-containing fragments were 
isolated and purified by electrophoresis through a gel of 
low-melting temperature agarose (2%). The Aspergillus nidulans 
secretion/expression vector pAlcAlS ATCC 53368 was digested with 
Bam HI, and the 5'-phosphates were removed by treatment with 
bacterial alkaline phosphatase (IBI). The linear, 
dephosphorylated form was purified by electrophoresis through a 
gel of low-melting temperature agarose (1%). Aliquots of the 
three melted gel fragments were mixed to give approximately 
equimolar proportion of fragments, and ligation was carried out 
at 15°C for 18 hrs. The resultant prochymosin 
expression/secretion vector, pAlcAlsL/CHYM 103 ATCC 67295 
(Figure 8), was identified and confirmed by restriction analysis. 

Transformation into Yeast and Filamentous Fungi 

Transformation of a Ura- yeast strain SC295 to Ura+ 
using expression vector pMV-1/CHYM 105 was carried out as 
previously described (Ito et al., 1984). 

Co-transformation of ArgB- Aspergillus niger or 
Aspergillus nidulans with an ArgB+ marker plus either of 
pGL2C/CHYM 101 or pAlcAlsL/CHYM 103 was performed as previously 
described (Buxton et al., 1985; Ballance et al., 1983). 
Positive transformants were identified by their ability to 
hybridize to a nick-translated Bgl II/Eco RV fragment from 
plasmid pGL2C/CHYM 103, in a modified colony hybridization. 

Expression of the Synthetic Prochymosin Gene by Yeast 

a) Expression of Prochymosin-Specific Messenger RNA 

Polyadenylated mRNA was isolated, by standard 
practices, both from yeast cells that had been transformed with 
the prochymosin expression plasmid, pMV-1/CHYM 105, and from the 
untransformed parental cells. The mRNA was blotted onto 
GeneScreen membranes (NEN) according to manufacturer's 
instructions. The blots were hybridized to nick-translated 
probes derived from plasmids containing portions of the 
synthetic prochymosin gene. Strong positive signals were 
detected from the transformed strains, when grown in the absence 
of glucose. Considerably weaker signals were detected in the 
transformed strains that had been grown in medium containing 
glucose, a condition which is well documented as being 
repressing for the melibiase promoter which is herein used to 
control the expression of the prochymosin gene (e.g. Friis and 
Ottolenghi, 1959). By comparison, no hybridization was detected 
in the untransformed parent, in either growth condition. This 
indicates that the transformed strain is indeed expressing a 
messenger RNA which is specific for the synthetic prochymosin 
coding region, and that the expression of the synthetic gene is 
capable of being controlled by appropriate biological switches. 

b) Immunological Detection of Prochymosin Antigens 

Polyclonal antiserum directed against commercially 
prepared, authentic bovine chymosin was derived from rabbits by 
standard procedures. The anti-chymosin antiserum was used in an 
ELISA test, utilizing urease-conjugated anti-rabbit IgG as a 
test system (Allelix UREIASE Reagents). Untransformed parents 
and transformed cells were grown to stationary phase, and the 
medium was concentrated by ultrafiltration through an Amicon 
P-10 membrane. Samples of the concentrated media were tested by 
the ELISA for the presence of (pro)chymosin-specific antigen. 
Once again, the transformed cells produced a positive response 
whereas the untransformed parents showed only a weak 
(background) response. This indicates that prochymosin is 
indeed being secreted from the yeast cells and can be found in 
the extracellular medium. 

c) Biological Milk-Clotting Activity 

As a final test of prochymosin activity, the 
concentrated, extracellular medium from cells grown to 
stationary phase was tested for the ability to clot milk. The 
prochymosin secreted from these cells must first be processed to 
an active form by incubation at acid pH (e.g. Pederson et al., 
1979; Foltmann, 1979). Incubation at pH 2 produces 
pseudochymosin, whereas incubation at pH around 4-4.5 produces 
chymosin. Therefore, the extracellular supernatants were 
activated by titration to the desired pH, followed by 
neutralization. Alternatively, cells were grown in medium at a 
pH already low enough (e.g. pH 3.5) to cause activation of any 
secreted prochymosin. Clotting was determined by a method 
similar to that of Emtage et al., 1983, using skim milk powder 
(Difco) reconstituted in 10 mM CaCl2 (10% w/v). Using this 
assay, concentrated medium from transformed cells was shown, 
after activation, to contain an enzyme activity which was able 
to clot milk. The medium from the untransformed parent showed 
no similar activity. 

Expression of the Synthetic Prochymosin Gene by Aspergillus 

a) Expression of Prochymosin-Specific Messenger RNA 

In an experiment similar to that described above for 
yeast cells, Aspergillus nidulans cells that had been 
transformed with the prochymosin expression vector pGL2C/CHYM 
101 were tested for the presence of mRNA which would hybridize 
to a prochymosin-specific probe. A nick-translated DNA fragment 
from the synthetic prochymosin gene was shown to hybridize, with 
varying degrees of strength, to polyadenylated mRNA isolated 
from transformed cells. The differences in strength of 
hybridization probably reflect differences in copy number of the 
integrated genes. By comparison, no hybridization was observed 
with mRNA from untransformed parental cells. 

b) Immunological Detection of Prochymosin Antigens 

In an experiment similar to that described above for 
yeast cells, Aspergillus nidulans cells that had been 
transformed with the prochymosin expression vector pAlcAlsL/CHYM 
103 were tested for the presence of prochymosin in the 
(concentrated) extracellular medium, by using the same 
anti-chymosin antibody and UREIASE-based ELISA methods. Once 
again, there was a positive reaction when using medium in which 
transformed cells had been grown, but not from medium in which 
the untransformed cells had been grown. 

While specific reference is made herein to an optimized 
prochymosin coding region, those skilled in the art will readily 
appreciate that the same principle, of optimizing codon usage in 
a coding region in accordance with the codon bias of a 
particular recipient cell, is applicable to protein coding 
regions other than prochymosin. 
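
The general principle can be sketched as a synonymous-codon substitution: each codon is translated with the standard genetic code and replaced by the codon preferred in the recipient cell for the same amino acid, leaving the encoded protein unchanged. The preference table below is a small hypothetical example, not the codon bias data relied on in this work, and the sketch does not model the restriction site adjustments described earlier.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
GENETIC_CODE = {"".join(c): aa
                for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical host preference: one favoured codon per amino acid.
PREFERRED = {"S": "TCT", "L": "TTG", "R": "AGA", "G": "GGT"}


def translate(seq: str) -> str:
    """Translate a coding sequence into one-letter amino acids."""
    seq = seq.upper()
    return "".join(GENETIC_CODE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def optimize(seq: str, preferred: dict) -> str:
    """Replace each codon by the preferred synonymous codon, if one is listed."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence must contain whole codons")
    out = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        amino_acid = GENETIC_CODE[codon]
        out.append(preferred.get(amino_acid, codon))
    return "".join(out)


# Hypothetical fragment encoding Ser-Leu-Arg-Gly.
natural = "AGCCTGCGGGGC"
optimized = optimize(natural, PREFERRED)
same_protein = translate(natural) == translate(optimized)
```

Amino acids absent from the table keep their original codon, so the same function could be used with a fuller preference table for any recipient cell.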

In addition, it will be appreciated that although the 
coding region may be synthesized specifically with a view to 
satisfying the codon preference in a given host, the coding 
region can be expressed by hosts other than the given host. As 
exemplified herein, the yeast optimized prochymosin coding 
region is also expressed by filamentous fungi. Accordingly, the 
utility of the optimized coding regions is not necessarily 
limited to transforming one particular host species or genus. 

It will be further appreciated that the vectors 
specifically disclosed herein are described for the purpose of 
exemplification. Further modification to the vector components 
such as the promoter region, the signal sequence and other 
components, may be carried out in order to enhance the level at 
which prochymosin is expressed and/or secreted by a transformed 
host. Once produced by the host, prochymosin can be converted 
readily to chymosin using standard procedures. 

Alternatively, vectors may be prepared as described 
which incorporate the optimized chymosin region only (as opposed 
to the entire region which encodes prochymosin). In that case, 
the "pro-" region may be deleted or may be modified in such a 
manner as to enhance expression of the chymosin enzyme by the 
transformed host. 

CLAIMS: 


1. A chymosin-encoding DNA segment having a codon sequence 
optimized for expression in a foreign host. 

2. A prochymosin-encoding DNA segment having a codon 
sequence optimized for expression in a foreign host. 

3. The DNA segment according to claim 1 or claim 2 wherein 
said host is selected from yeast and filamentous fungi. 

4. The DNA segment according to claim 3 wherein said host 
is Saccharomyces sp. 

5. The DNA segment according to claim 3 wherein said host 
is Aspergillus sp. 

6. An expression vector comprising the DNA segment as 
defined in claim 1 or claim 2. 

7. The expression vector according to claim 6 which 
comprises a promoter region operatively coupled with said DNA 
segment. 

8. The expression vector according to claim 7 which 
comprises a signal sequence operatively associated with said DNA 
segment. 

9. The expression vector according to claim 8 which 
comprises the promoter region and signal sequence of the 
melibiase gene of Saccharomyces sp. 

10. The expression vector according to claim 8 which 
comprises the promoter region and signal sequence of the 
glucoamylase gene of Aspergillus sp. 

11. The expression vector according to claim 8 which 
comprises the promoter region and signal sequence defined on 
plasmid pAlcAlS ATCC 53368. 

12. The expression vector according to claim 8 which is 
selected from the plasmids pGL2C/CHYM 101, ATCC 67296; 
pAlcAlS/CHYM 103, ATCC 67295 and pMV-1/CHYM 105, ATCC 67294. 

13. Transformed cells comprising a DNA segment defined in 
claim 1 or claim 2. 

14. Transformed cells according to claim 13 which are 
yeasts. 

15. Transformed cells according to claim 14 which are of 
the species Saccharomyces cerevisiae. 

16. Transformed cells according to claim 13 which are 
filamentous fungi. 

17. Transformed cells according to claim 16 which are of 
the genus Aspergillus. 

18. Transformed cells according to claim 17 which are of 
the species Aspergillus nidulans or Aspergillus niger. 

19. A process for producing prochymosin which comprises 
culturing transformed cells as defined in claim 13 under growth 
promoting conditions. 

20. A process for producing chymosin which comprises 
culturing transformed cells as defined in claim 13 under growth 
promoting conditions, and converting the prochymosin expressed 
thereby to chymosin. 


